Description

[0001] The present invention relates to the field of data compression and, more particularly, to a method, a system, and techniques for compressing digital motion video signals in keeping with algorithms similar to the emerging MPEG standard proposed by the International Standards Organization's Moving Picture Experts Group (MPEG).

[0002] Technological advances in digital transmission networks, digital storage media, Very Large Scale Integration devices, and digital processing of video and audio signals are converging to make the transmission and storage of digital video economical in a wide variety of applications. Because the storage and transmission of digital video signals is central to many applications, and because an uncompressed representation of a video signal requires a large amount of storage, the use of digital video compression techniques is vital to this advancing art. In this regard, several international standards for the compression of digital video signals have emerged over the past decade, with more currently under development. These standards apply to algorithms for the transmission and storage of compressed digital video in a variety of applications, including: video-telephony and teleconferencing; high quality digital television transmission on coaxial and fiber-optic networks as well as broadcast terrestrially and over direct broadcast satellites; and interactive multimedia products on CD-ROM, Digital Audio Tape, and Winchester disk drives.

[0003] Several of these standards involve algorithms based on a common core of compression techniques, e.g., the CCITT (Consultative Committee on International Telegraphy and Telephony) Recommendation H.120, the CCITT Recommendation H.261, and the ISO/IEC MPEG standard. The MPEG algorithm has been developed by the Moving Picture Experts Group (MPEG), part of a joint technical committee of the International Standards Organization (ISO) and the International Electrotechnical Commission (IEC). The MPEG committee has been developing a standard for the multiplexed, compressed representation of video and associated audio signals. The standard specifies the syntax of the compressed bit stream and the method of decoding, but leaves considerable latitude for novelty and variety in the algorithm employed in the encoder.

[0004] As the present invention may be applied in connection with such an encoder, in order to facilitate an understanding of the invention, some pertinent aspects of the MPEG video compression algorithm will be reviewed. It is to be noted, however, that the invention can also be applied to other video coding algorithms which share some of the features of the MPEG algorithm.

The MPEG Video Compression Algorithm 


[0005] To begin with, it will be understood that the compression of any data object, such as a page of text, an image, a segment of speech, or a video sequence, can be thought of as a series of steps, including: 1) a decomposition of that object into a collection of tokens; 2) the representation of those tokens by binary strings which have minimal length in some sense; and 3) the concatenation of the strings in a well-defined order. Steps 2 and 3 are lossless, i.e., the original data is faithfully recoverable upon reversal, and Step 2 is known as entropy coding. (See, e.g., T. BERGER, Rate Distortion Theory, Englewood Cliffs, NJ: Prentice-Hall, 1977; R. McELIECE, The Theory of Information and Coding, Reading, MA: Addison-Wesley, 1971; D.A. HUFFMAN, "A Method for the Construction of Minimum Redundancy Codes," Proc. IRE, pp. 1098-1101, September 1952; G.G. LANGDON, "An Introduction to Arithmetic Coding," IBM J. Res. Develop., vol. 28, pp. 135-149, March 1984.) Step 1 can be either lossless or lossy in general. Most video compression algorithms are lossy because of stringent bit-rate requirements. A successful lossy compression algorithm eliminates redundant and irrelevant information, allowing relatively large errors where they are not likely to be visually significant and carefully representing aspects of a sequence to which the human observer is very sensitive. The techniques employed in the MPEG algorithm for Step 1 can be described as predictive/interpolative motion-compensated hybrid DCT/DPCM coding. Huffman coding, also known as variable length coding (see the above-cited HUFFMAN 1952 paper), is used in Step 2. Although, as mentioned, the MPEG standard is really a specification of the decoder and the compressed bit stream syntax, the following description of the MPEG specification is, for ease of presentation, primarily from an encoder point of view.

[0006] The MPEG video standard specifies a coded representation of video for digital storage media, as set forth in ISO-IEC JTC1/SC2/WG11 MPEG CD-11172, MPEG Committee Draft, 1991. The algorithm is designed to operate on noninterlaced component video. Each picture has three components: luminance (Y), red color difference (Cr), and blue color difference (Cb). The Cr and Cb components each have half as many samples as the Y component in both horizontal and vertical directions. Aside from this stipulation on input data format, no restrictions are placed on the amount or nature of pre-processing that may be performed on source video sequences as preparation for compression. Methods for such pre-processing are one object of this invention.


Layered Structure of an MPEG Sequence 

[0007] An MPEG data stream consists of a video stream and an audio stream which are packed, together with systems information and possibly other bitstreams, into a systems data stream that can be regarded as layered. Within the video layer of the MPEG data stream, the compressed data is further layered. A description of the organization of the layers will aid in understanding the invention. These layers of the MPEG Video Layered Structure are shown in Figures 1-4. Specifically, the Figures show:

Figure 1: Exemplary pair of Groups of Pictures (GOP's).
Figure 2: Exemplary macroblock (MB) subdivision of a picture.
Figure 3: Exemplary slice subdivision of a picture.
Figure 4: Block subdivision of a macroblock.

[0008] The layers pertain to the operation of the compression algorithm as well as the composition of a compressed bit stream. The highest layer is the Video Sequence Layer, containing control information and parameters for the entire sequence. At the next layer, a sequence is subdivided into sets of consecutive pictures, each known as a Group of Pictures (GOP). A general illustration of this layer is shown in Figure 1. Decoding may begin at the start of any GOP, essentially independent of the preceding GOP's. There is no limit to the number of pictures which may be in a GOP, nor do there have to be equal numbers of pictures in all GOP's.

[0009] The third or Picture layer is a single picture. A general illustration of this layer is shown in Figure 2. The luminance component of each picture is subdivided into 16 x 16 regions; the color difference components are subdivided into 8 x 8 regions spatially co-sited with the 16 x 16 luminance regions. Taken together, these co-sited luminance and color difference regions make up the fifth layer, known as a macroblock (MB). Macroblocks in a picture are numbered consecutively in lexicographic order, starting with Macroblock 1.

[0010] Between the Picture and MB layers is the fourth or slice layer. Each slice consists of some number of consecutive MB's. Slices need not be uniform in size within a picture or from picture to picture. They may be only a few macroblocks in size or extend across multiple rows of MB's as shown in Figure 3.

[0011] Finally, each MB consists of four 8 x 8 luminance blocks and two 8 x 8 chrominance blocks as seen in Figure 4. If the width of each luminance picture (in picture elements or pixels) is denoted as C and the height as R (C is for columns, R is for rows), a picture is C_MB = C/16 MB's wide and R_MB = R/16 MB's high. Similarly, it is C_B = C/8 blocks wide and R_B = R/8 blocks high.
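The arithmetic of the preceding paragraph can be sketched in a few lines of Python. This is only an illustration: the function name picture_grid and the 352 x 240 example size are assumptions introduced here, and the sketch assumes that C and R are exact multiples of 16.

```python
def picture_grid(c, r):
    """Return (C_MB, R_MB, C_B, R_B) for a luminance picture c pixels wide and r pixels high."""
    if c % 16 or r % 16:
        # Assumption of this sketch: no partial macroblocks at the picture edges.
        raise ValueError("width and height must be multiples of 16")
    # Macroblocks cover 16 x 16 luminance pixels; blocks cover 8 x 8.
    return c // 16, r // 16, c // 8, r // 8


c_mb, r_mb, c_b, r_b = picture_grid(352, 240)
print(c_mb, r_mb, c_b, r_b)  # 22 15 44 30
```

Each of the C_MB x R_MB macroblocks then carries four luminance blocks and two chrominance blocks, as stated above.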

[0012] The Sequence, GOP, Picture, and slice layers all have headers associated with them. The headers begin with byte-aligned Start Codes and contain information pertinent to the data contained in the corresponding layer.

[0013] Within a GOP, three types of pictures can appear. The distinguishing difference among the picture types is the compression method used. The first type, Intramode pictures or I-pictures, are compressed independently of any other picture. Although there is no fixed upper bound on the distance between I-pictures, it is expected that they will be interspersed frequently throughout a sequence to facilitate random access and other special modes of operation. Each GOP must start with an I-picture, and additional I-pictures can appear within the GOP. The other two types of pictures, predictively motion-compensated pictures (P-pictures) and bidirectionally motion-compensated pictures (B-pictures), will be described in the discussion on motion compensation below.

[0014] Certain rules apply as to the number and order of I-, P-, and B-pictures in a GOP. Referring to I- and P-pictures collectively as anchor pictures, a GOP must contain at least one anchor picture, and may contain more. In addition, between each adjacent pair of anchor pictures, there may be zero or more B-pictures. An illustration of a typical GOP is shown in Figure 5.

Macroblock Coding in I-pictures

[0015] One very useful image compression technique is transform coding. (See N.S. JAYANT and P. NOLL, Digital Coding of Waveforms, Principles and Applications to Speech and Video, Englewood Cliffs, NJ: Prentice-Hall, 1984, and A.G. TESCHER, "Transform Image Coding," in W.K. Pratt, editor, Image Transmission Techniques, pp. 113-155, New York, NY: Academic Press, 1979.) In MPEG and several other compression standards, the discrete cosine transform (DCT) is the transform of choice. (See K.R. RAO and P. YIP, Discrete Cosine Transform, Algorithms, Advantages, Applications, San Diego, CA: Academic Press, 1990, and N. AHMED, T. NATARAJAN, and K.R. RAO, "Discrete Cosine Transform," IEEE Transactions on Computers, pp. 90-93, January 1974.) The compression of an I-picture is achieved by the steps of 1) taking the DCT of blocks of pixels, 2) quantizing the DCT coefficients, and 3) Huffman coding the result. In MPEG, the DCT operation converts a block of n x n pixels into an n x n set of transform coefficients. Like several of the international compression standards, the MPEG algorithm uses a DCT block size of 8 x 8. The DCT transformation by itself is a lossless operation, which can be inverted to within the precision of the computing device and the algorithm with which it is performed.
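To make the last statement concrete, the following Python sketch applies a direct (unoptimized) 8 x 8 DCT and its inverse and checks that the pixels are recovered to within floating-point error. The orthonormal scaling, the function names dct2 and idct2, and the test block are assumptions made for illustration; a practical encoder would use a fast algorithm and the arithmetic precision required by the standard.

```python
import math

N = 8  # DCT block size used by MPEG


def _scale(k):
    # Orthonormal DCT-II scaling (an assumed normalization for this sketch).
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


# COS[k][i] = cos((2i + 1) * k * pi / (2N)), shared by both transforms.
COS = [[math.cos((2 * i + 1) * k * math.pi / (2 * N)) for i in range(N)]
       for k in range(N)]


def dct2(block):
    """Forward 2-D DCT of an N x N block of pixel values."""
    return [[_scale(m) * _scale(n) * sum(block[y][x] * COS[m][y] * COS[n][x]
                                          for y in range(N) for x in range(N))
             for n in range(N)] for m in range(N)]


def idct2(coeffs):
    """Inverse 2-D DCT; recovers the pixels up to floating-point error."""
    return [[sum(_scale(m) * _scale(n) * coeffs[m][n] * COS[m][y] * COS[n][x]
                 for m in range(N) for n in range(N))
             for x in range(N)] for y in range(N)]


block = [[(3 * y + 5 * x) % 256 for x in range(N)] for y in range(N)]
restored = idct2(dct2(block))
max_error = max(abs(restored[y][x] - block[y][x])
                for y in range(N) for x in range(N))
print(max_error < 1e-9)  # True
```

With this scaling, a flat block of value v produces a single nonzero coefficient of 8v at position (0, 0).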

[0016] The second step, quantization of the DCT coefficients, is the primary source of lossiness in the MPEG algorithm. Denote the elements of the two-dimensional array of DCT coefficients by c_mn, where m and n can range from 0 to 7. Aside from truncation or rounding corrections, quantization is achieved by dividing each DCT coefficient c_mn by w_mn × QP, with w_mn being a weighting factor and QP being the quantizer parameter. Note that QP is applied to each DCT coefficient. The weighting factor w_mn allows coarser quantization to be applied to the less visually significant coefficients. There can be two sets of these weights, one for I-pictures and the other for P- and B-pictures. Custom weights may be transmitted in the video sequence layer, or default values may be used. The quantizer parameter QP is the primary means of trading off quality vs. bit-rate in MPEG. It is important to note that QP can vary from MB to MB within a picture. This feature, known as adaptive quantization (AQ), permits different regions of each picture to be quantized with different step-sizes, and can be used to attempt to equalize (and optimize) the visual quality over each picture and from picture to picture. Although the MPEG standard allows adaptive quantization, algorithms which consist of rules for the use of AQ to improve visual quality are not subject to standardization. A class of rules for AQ is one object of this invention.
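The division model described in this paragraph can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the weights and coefficients below are invented 2 x 2 excerpts rather than the default MPEG matrices, rounding to the nearest integer stands in for the "truncation or rounding corrections", and the function names are hypothetical.

```python
def quantize(coeffs, weights, qp):
    """Divide each coefficient c_mn by w_mn * QP and round to the nearest integer."""
    return [[round(c / (w * qp)) for c, w in zip(c_row, w_row)]
            for c_row, w_row in zip(coeffs, weights)]


def dequantize(levels, weights, qp):
    """Approximate reconstruction: multiply each level back by w_mn * QP."""
    return [[level * w * qp for level, w in zip(l_row, w_row)]
            for l_row, w_row in zip(levels, weights)]


# Illustrative values only (not taken from the standard).
coeffs = [[800.0, 37.0], [-45.0, 6.0]]
weights = [[8, 16], [16, 24]]

fine = quantize(coeffs, weights, qp=2)
coarse = quantize(coeffs, weights, qp=6)
print(fine)    # [[50, 1], [-1, 0]]
print(coarse)  # [[17, 0], [0, 0]]
print(dequantize(fine, weights, qp=2))  # [[800, 32], [-32, 0]]
```

Raising QP from 2 to 6 drives more coefficients to zero, which lowers the bit count at the cost of a larger reconstruction error; this is the quality vs. bit-rate trade-off that adaptive quantization exercises from MB to MB.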

[0017] Following quantization, the DCT coefficient information for each MB is organized and coded, using a set of Huffman codes. As the details of this step are not essential to an understanding of the invention and are generally understood in the art, no further description will be offered here. For further information in this regard, reference may be had to the previously-cited HUFFMAN 1952 paper.

Motion Compensation 

[0018] Most video sequences exhibit a high degree of correlation between consecutive pictures. A useful method to remove this redundancy prior to coding a picture is "motion compensation". Motion compensation requires some means for modeling and estimating the motion in a scene. In MPEG, each picture is partitioned into macroblocks and each MB is compared to 16 x 16 regions in the same general spatial location in a predicting picture or pictures. The region in the predicting picture(s) that best matches the MB in some sense is used as the prediction. The difference between the spatial location of the MB and that of its predictor is referred to as a motion vector. Thus, the outputs of the motion estimation and compensation for an MB are motion vectors and a motion-compensated difference macroblock. In compressed form, these generally require fewer bits than the original MB itself. Pictures which are predictively motion-compensated using a single predicting picture in the past are known as P-pictures. This kind of prediction is also referred to in MPEG as forward-in-time prediction.
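One common way to find the region that "best matches the MB in some sense" is an exhaustive block-matching search using the sum of absolute differences (SAD). The Python sketch below is a minimal illustration under that assumption; the MPEG standard does not prescribe the search method or the matching criterion, and the function names, the 4 x 4 block size, the search range, and the sign convention of the vector are all choices made here for brevity.

```python
def sad_at(cur, ref, top, left, dy, dx, size):
    """SAD between the block of cur at (top, left) and the block of ref displaced by (dy, dx)."""
    return sum(abs(cur[top + y][left + x] - ref[top + dy + y][left + dx + x])
               for y in range(size) for x in range(size))


def estimate_motion(cur, ref, top, left, size=16, search=2):
    """Return ((dy, dx), sad) of the best match within +/- search pixels."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the predicting picture.
            if (top + dy < 0 or left + dx < 0 or
                    top + dy + size > height or left + dx + size > width):
                continue
            cost = sad_at(cur, ref, top, left, dy, dx, size)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    cost, dy, dx = best
    return (dy, dx), cost


# Synthetic 12 x 12 pictures: cur is ref displaced by one row and two columns.
ref = [[7 * y + 13 * x for x in range(12)] for y in range(12)]
cur = [[ref[y + 1][x - 2] if 0 <= y + 1 < 12 and 0 <= x - 2 < 12 else 0
        for x in range(12)] for y in range(12)]

vector, cost = estimate_motion(cur, ref, top=4, left=4, size=4, search=2)
print(vector, cost)  # (1, -2) 0
```

The motion-compensated difference macroblock is then the element-wise difference between the current block and the matched region, which is zero in this synthetic case.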

[0019] As discussed previously, the time interval between a P-picture and its predicting picture can be greater than one picture interval. For pictures that fall between P-pictures or between an I-picture and a P-picture, backward-in-time prediction may be used in addition to forward-in-time prediction (see Figure 5). Such pictures are known as bidirectionally motion-compensated pictures, or B-pictures. For B-pictures, in addition to forward and backward prediction, interpolative motion compensation is allowed in which the predictor is an average of a block from the previous predicting picture and a block from the future predicting picture. In this case, two motion vectors are needed.
[0020] The use of bidirectional motion compensation leads to a two-level motion compensation structure, as depicted in Figure 5. Each arrow indicates the prediction of the picture touching the arrowhead using the picture touching the dot. Each P-picture is motion-compensated using the previous anchor picture (I-picture or P-picture, as the case may be). Each B-picture is motion-compensated by the anchor pictures immediately before and after it. No limit is specified in MPEG on the distance between anchor pictures, nor on the distance between I-pictures. In fact, these parameters do not have to be constant over an entire sequence. Referring to the distance between I-pictures as N and to the distance between P-pictures as M, the sequence shown in Figure 5 has (N,M)=(9,3). In coding the three picture types, different amounts of compressed data are required to attain similar levels of reconstructed picture quality. The exact ratios depend on many things, including the amount of spatial detail in the sequence, and the amount and compensability of motion in the sequence.
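For a sequence in which N and M are held constant, the picture types in display order follow directly from the two distances. The Python sketch below generates that pattern; it assumes that N is a multiple of M and that the sequence starts on an I-picture, and the function name gop_pattern is introduced here only for illustration.

```python
def gop_pattern(n, m, count):
    """Picture types, in display order, for I-picture distance n and anchor distance m."""
    if n % m != 0:
        # Assumption of this sketch: anchors fall on a regular grid within each period.
        raise ValueError("n must be a multiple of m")
    types = []
    for k in range(count):
        position = k % n
        if position == 0:
            types.append("I")   # every n-th picture is intra-coded
        elif position % m == 0:
            types.append("P")   # remaining anchors are predicted pictures
        else:
            types.append("B")   # pictures between anchors are bidirectional
    return "".join(types)


print(gop_pattern(9, 3, 10))  # IBBPBBPBBI
```

With (N,M)=(9,3), every anchor picture is followed by two B-pictures, and a new I-picture appears every ninth picture.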

[0021] It should therefore be understood that an MPEG-1 sequence consists of a series of I-pictures which may have zero or more P-pictures sandwiched between them. The various I- and P-pictures may in turn have zero or more B-pictures sandwiched between them; where B-pictures are present, the I- and P-pictures operate as anchor pictures.

Macroblock Coding in P-pictures and B-pictures 


[0022] It will be appreciated that there are three kinds of motion compensation which may be applied to MB's in B-pictures: forward, backward, and interpolative. The encoder must select one of these modes. For some MBs, none of the motion compensation modes yields an accurate prediction. In such cases, the MB may be processed in the same fashion as a macroblock in an I-picture (i.e., as an intramode MB). This is another possible MB mode. Thus, there are a variety of MB modes for P- and B-pictures.
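The mode decision described above might be sketched as follows in Python. The selection rule is an assumption made for illustration: the standard leaves the decision to the encoder, and the use of the sum of absolute differences, the rounded average for the interpolative predictor, the intra_threshold parameter, and the function names are all hypothetical choices.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for row_a, row_b in zip(a, b) for p, q in zip(row_a, row_b))


def interpolate(fwd, bwd):
    """Interpolative predictor: rounded average of the forward and backward blocks."""
    return [[(f + b + 1) // 2 for f, b in zip(row_f, row_b)]
            for row_f, row_b in zip(fwd, bwd)]


def choose_b_mode(mb, fwd, bwd, intra_threshold):
    """Pick the best of the three prediction modes, or fall back to intramode."""
    candidates = {
        "forward": fwd,
        "backward": bwd,
        "interpolative": interpolate(fwd, bwd),
    }
    mode = min(candidates, key=lambda name: sad(mb, candidates[name]))
    cost = sad(mb, candidates[mode])
    if cost > intra_threshold:
        # No mode predicts the MB accurately enough: code it as in an I-picture.
        return "intra", cost
    return mode, cost


mb = [[10, 20], [30, 40]]
print(choose_b_mode(mb, [[8, 18], [28, 38]], [[12, 22], [32, 42]], 64))
# ('interpolative', 0)
print(choose_b_mode(mb, [[100, 100], [100, 100]], [[200, 200], [200, 200]], 64))
# ('intra', 300)
```

In the first call, the average of the two predicting blocks reproduces the MB exactly, so the interpolative mode wins; in the second, even the best prediction is poor and the sketch falls back to intramode coding.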

[0023] Aside from the need to code side information relating to the MB mode used to code each MB and any motion vectors associated with that mode, the coding of motion-compensated macroblocks is very similar to that of intramode MBs. Although there is a small difference in the quantization, the model of division by w_mn × QP still holds. Furthermore, adaptive quantization (AQ) may be used.

Rate Control

[0024] The MPEG algorithm is intended to be used primarily with fixed bit-rate storage media. However, the number of bits in each picture will not be exactly constant, due to the different types of picture processing, as well as the inherent variation with time of the spatio-temporal complexity of the scene being coded. The MPEG algorithm uses a buffer-based rate control strategy to put meaningful bounds on the variation allowed in the bit-rate. A Video Buffer Verifier (VBV) is devised in the form of a virtual buffer, whose sole task is to place bounds on the number of bits used to code each picture so that the overall bit-rate equals the target allocation and the short-term deviation from the target is bounded. This rate control scheme can be explained as follows. Consider a system consisting of a buffer followed by a hypothetical decoder. The buffer is filled at a constant bit-rate with compressed data in a bit stream from the storage medium. Both the buffer size and the bit-rate are parameters which are transmitted in the compressed bit stream. After an initial delay, which is also derived from information in the bit stream, the hypothetical decoder instantaneously removes from the buffer all of the data associated with the first picture. Thereafter, at intervals determined by the picture rate of the sequence, the decoder removes all data associated with the earliest picture in the buffer. In order that the bit stream satisfy the MPEG rate control requirements, it is necessary that all the data for each picture be available within the buffer at the instant it is needed by the decoder. This requirement translates to upper and lower bounds (U_VBV and L_VBV) on the number of bits allowed in each picture. The upper and lower bounds for a given picture depend on the number of bits used in all the pictures preceding it. It is the function of the encoder to produce bit streams which satisfy this requirement. It is not expected that actual decoders will be configured or operate in the manner described above. The hypothetical decoder and its associated buffer are simply a means of placing computable limits on the size of compressed pictures.
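The buffer model can be simulated with a short Python sketch. It is a simplified reading of the scheme, not the normative VBV definition: it assumes that the upper bound for a picture is the buffer fullness at the instant of removal, that the lower bound is whatever must be removed so the buffer does not overflow before the next removal, and that all quantities are whole numbers of bits. The function name vbv_bounds and the numbers in the example are hypothetical.

```python
def vbv_bounds(picture_bits, bits_per_interval, buffer_size, initial_fullness):
    """Return one (lower, upper, ok) tuple per picture under the simplified model."""
    fullness = initial_fullness  # bits accumulated during the initial delay
    report = []
    for bits in picture_bits:
        # Upper bound: all of the picture's data must already be in the buffer.
        upper = fullness
        # Lower bound: enough bits must leave so that the constant-rate input
        # arriving before the next removal does not exceed the buffer size.
        lower = max(0, fullness + bits_per_interval - buffer_size)
        report.append((lower, upper, lower <= bits <= upper))
        # Remove the picture, then refill for one picture interval.
        fullness = fullness - bits + bits_per_interval
    return report


# Example: 40,000 bits arrive per picture interval (e.g. 1.2 Mbit/s at 30 pictures/s).
report = vbv_bounds([150000, 20000, 10000, 5000],
                    bits_per_interval=40000,
                    buffer_size=260000,
                    initial_fullness=240000)
for lower, upper, ok in report:
    print(lower, upper, ok)
# 20000 240000 True
# 0 130000 True
# 0 150000 True
# 0 180000 True
```

As the trace shows, the bounds for each picture depend on the bits spent on all the pictures before it, which is why the encoder must track the buffer state as it codes.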

[0025] One important function of an MPEG encoder is to ensure that the video bitstream it produces satisfies these bounds. There are no other restrictions on the number of bits used to code the pictures in a sequence. This latitude should be used to allocate the bits in such a way as to equalize (and optimize) the visual quality of the resulting reconstructed pictures. A solution to this bit allocation problem is another object of this invention.

THE PROBLEM 


[0026] It should be understood, therefore, from the foregoing description of the MPEG algorithm, that the purpose of the MPEG standard is to specify the syntax of the compressed bit stream and the methods used to decode it. Considerable latitude is afforded encoder algorithm and hardware designers to tailor their systems to the specific needs of their application. The degree of complexity in the encoder can be traded off against the visual quality at a particular bit-rate to suit specific applications. A large variety of compressed bit-rates and image sizes are also possible. This will accommodate applications ranging from low bit-rate videophones up to full-screen multimedia presentations with quality comparable to VHS videocassette recordings. Consequently, the problem to which the present invention is addressed is achieving compression of digital video sequences in accordance with the MPEG standard, applying techniques of the type discussed above using adaptive quantization and bit-rate control in a manner that optimizes the visual quality of the compressed sequence while ensuring that the bit stream satisfies the MPEG fixed bit-rate requirements.

PRIOR ART 

[0027] In the open literature, a number of schemes have appeared which address certain aspects of the problem of adaptive quantization and bit-rate control. For example, W-H. CHEN and W.K. PRATT, in their paper, "Scene Adaptive Coder," IEEE Trans. Communications, vol. COM-32, pp. 225-232, March 1984, discuss the idea of a rate-controlled quantization factor for transform coefficients. The rate control strategy used there is commonly applied in image and video compression algorithms to match the variable bit-rate produced when coding to a constant bit-rate channel. In this case, each picture in the sequence is treated as if it were a still picture, and the scheme essentially strives to achieve a constant picture bit allocation for every picture in the sequence. More details on such techniques can be found in the above-cited TESCHER 1979 book chapter.

[0028] Although the CHEN and PRATT 1984 paper deals with image coding, the ideas set forth therein would be 
applicable to video coding as well. However, there is no mechanism for adapting the quantization factor according to 
the nature of the images themselves. 
[0029] C-T. CHEN and D.J. LeGALL describe an adaptive scheme for selecting the quantization factor based on the magnitude of the k-th largest DCT coefficient in each block in their article "A K-th Order Adaptive Transform Coding Algorithm for Image Data Compression," SPIE Vol. 1153, Applications of Digital Image Processing XII, vol. 1153, pp. 7-18, 1989.

[0030] H. LOHSCHELLER proposes a technique for classifying blocks in "A Subjectively Adapted Image Communication System," IEEE Trans. Communications, vol. COM-32, pp. 1316-1322, December 1984. This technique is related to adaptive zonal sampling and adaptive vector quantization.

[0031] K.N. NGAN, K.S. LEONG, and H. SINGH, in "A HVS-weighted Cosine Transform Coding Scheme with Adaptive Quantization," SPIE Vol. 1001, Visual Communications and Image Processing, vol. 1001, pp. 702-708, 1988, propose an adaptive quantizing transform image coding scheme in which a rate controlling buffer and the contrast of the DC term of each block with respect to its nearest neighbor blocks in raster scan order are used in combination to adapt the quantizer factor.

[0032] H. HOELZLWIMMER discusses, in "Rate Control in Variable Transmission Rate Image Coders," SPIE Vol. 1153, Applications of Digital Image Processing XII, vol. 1153, pp. 77-89, 1989, a combined bit-rate and quality controller. Two parameters, quantizer step size and spatial resolution, are used to control the reconstruction error and bit-rate. A spatial domain weighted mean square error measure is used to control the parameters.

[0033] Co-pending application U.S. Ser. No. 705,234, filed May 24, 1991 by the present inventors, addresses the problem of adaptive quantization. The techniques disclosed therein can be used as one of the subsystems in the present invention, that is, the Adaptive-quantizing Rate-controlled (AQ/RC) Picture Coder.

[0034] US Patent 5 051 840 discloses a picture signal coding device which divides picture data into a plurality of blocks and subjects the individual blocks of picture data to an orthogonal transform, a normalization, and a coding. As is particularly disclosed in column 3, lines 26 to 32, the compression coding is directed to still picture data. The amounts of data to be allocated to the individual blocks are determined on the basis of the ratios of activities of the individual blocks to a sum of the block-by-block activities, whereby the output of the individual block of coded data is restricted. The amount of data is allocated in matching relation to the frequency components of the individual blocks. Further, the number of bits to be assigned to the entire picture is maintained constant.

OBJECT 


[0035] In contrast to the foregoing prior art systems and algorithms, it is an object of the present invention to provide a method and a system for allocating bits among compressed pictures in an MPEG II data stream, which applies to video compression algorithms intended to produce a fixed-bit-rate compressed data stream.

Summary of the Invention

[0036] The present invention discloses a method and a system for performing a picture bit allocation in accordance 
with claims 1 and 6 respectively. 

[0037] The method and the system are applicable to video compression algorithms intended to produce a fixed-bit-rate compressed data stream, and in which motion compensation is employed.

[0038] This method of allocating bits among the successive pictures in a video sequence results in a consistent visual quality from picture to picture, while meeting the MPEG Video Buffer Verifier (VBV) bit-rate limitations.

Brief Description of the Drawings 


[0039] 

Figures 1-4 illustrate layers of compressed data within the video compression layer of the MPEG data stream; in particular, Figure 1 depicts an exemplary set of Groups of Pictures (GOP's), Figure 2 depicts an exemplary Macroblock (MB) subdivision of a picture, Figure 3 depicts an exemplary Slice subdivision of a frame or picture, and Figure 4 depicts the Block subdivision of a Macroblock.

Figure 5 illustrates the two-level motion compensation among pictures in a GOP employed in MPEG.

Figure 6 is a block diagram of an MPEG encoder incorporating three component subsystems for implementing techniques in accordance with the present invention.

Figure 7 shows the coding difficulty factors for the entire sequence of pictures in a video sequence, composed of two test sequences used in the MPEG standards effort, including the first 60 frames of the Flower Garden sequence, followed by the first 60 frames of the Table Tennis sequence, followed by 30 repetitions of the 61-st frame of Table Tennis (to simulate a still scene), and used throughout the description to illustrate the methods of the invention.

Figure 8 depicts the bit allocations computed for each picture of the sequence of Figure 7.

Figure 9 depicts the target and actual bit-rates for each picture of the sequence of Figure 7.

Figure 10 is a plot of the quantization (QP) factors used to code the sequence of Figure 7.

Figure 11 is a block diagram showing in more detail the AQ/RC Picture Coder subsystem of Figure 6.

Figure 12 depicts typical class distributions for I- and P-pictures taken from both the Flower Garden and Table Tennis segments of the MPEG test sequences.

Figures 13 and 14 depict the performance of the QP assignment and update strategies in bit-rate control in accordance with the invention, with Figure 13 showing the QP_low and average QP in each row of Frames 16, 22, 61, and 67 of a test sequence, and Figure 14 showing the bits produced versus the targets on a row by row basis.

Figure 15 depicts the details of the QP-Adaptive Pre-processor shown in Figure 6.

Figure 16 depicts three possible filter states (FS) of the QP-Adaptive Pre-processor shown in Figure 15.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0040] Preliminarily, as noted above, an important feature of the ISO/IEC MPEG standard is that only the syntax of the compressed bit stream and the method of decoding it are specified in detail. Therefore, it is possible to have different encoders, all of which produce bit streams compatible with the syntax of the standard, but which are of different complexities, and result in different levels of visual quality at a given bit-rate. The MPEG standard applies primarily, but not exclusively, to situations in which the average bit-rate of the compressed data stream is fixed. The MPEG specification contains a precise definition of the term "fixed bit-rate". However, even though the average rate must be constant, the number of bits allocated to each picture in an MPEG video sequence does not have to be the same for all pictures. Furthermore, allocation of bits within a picture does not have to be uniform. Part of the challenge in designing an encoder that produces high quality sequences at low bit-rates is developing a technique to allocate the total bit budget among pictures and within a picture.

[0041] Also to be kept in mind is another coding feature of importance to the MPEG standard, that is, adaptive quantization (AQ). This technique permits different regions of each picture to be coded with varying degrees of fidelity, and can be used in image and motion video compression to attempt to equalize (and optimize) the visual quality over each picture and from picture to picture. Although the MPEG standard allows adaptive quantization, algorithms which consist of rules for the use of AQ to improve visual quality are not prescribed in the standard.

[0042] Another broad class of techniques that can be applied in an MPEG or similar encoder is generally referred to as pre-processing. Any sort of pre-processing of a digital video sequence which does not change the fundamental spatial relationship of the samples to one another may be incorporated into an MPEG-compatible encoder for the purpose of improving the visual quality of the compressed sequence. Examples of this include linear or nonlinear pre-filtering.

[0043] Turning to the invention, a block diagram of an MPEG encoder incorporating three component subsystems for implementing the above-mentioned techniques in accordance with the present invention is shown in Figure 6. As seen in the Figure, to begin with, picture data P_k representative of the k-th picture in a sequence enters one subsystem, QP-adaptive Pre-processor 3, where pre-processing may take place if appropriate. The nature of the pre-processing is controlled by quantization levels (QP_prev) of previously coded pictures, which will have been previously communicated to subsystem 3 from Adaptive-quantizing Rate-controlled (AQ/RC) Picture Coder 1 in the course of coding the data sequence. The possibly pre-processed picture data F_k output by subsystem 3 enters the next subsystem, AQ/RC Picture Coder 1, where motion estimation and MB classification take place. Some of the results of these operations within the AQ/RC Picture Coder 1 (D_k) are passed to the remaining subsystem, Picture Bit Allocation subsystem 2, and a target number of bits for the picture data F_k is passed back (A_k, S_k, and C_k) to the AQ/RC Picture Coder 1. Coding then proceeds, as is described in more detail below. Ultimately, compressed data for picture data F_k, CD_k, is output from the AQ/RC Picture Coder 1. Additionally, data relating to the number of bits required to code F_k (B_k) and the reconstruction error (E_k) are passed to the Picture Bit Allocation subsystem 2, and the previous quantization level QP_prev, which may be an average value, QP_avg, is passed to the QP-adaptive Pre-processor subsystem 3, for use in processing future frames.

[0044] For purposes of operational description of the three subsystems, the operation of the Picture-to-Picture Bit
Allocation subsystem 2 will first be explained, followed by an explanation of the functioning of the AQ/RC Picture
Coder subsystem 1, and then the QP-adaptive Pre-processor subsystem 3 will be described. It may be helpful for a full
understanding of the relationship of the invention to the MPEG video compression algorithm to refer to the afore-cited
MPEG CD-11172 and to ISO-IEC JTC1/SC2/WG11 MPEG 91/74, MPEG Video Report Draft, 1991, or D. LeGall, "MPEG: A Video
Compression Standard for Multimedia Applications," Communications of the ACM, vol. 34, April 1991.

Picture to Picture Bit Allocation

[0045] Video compression algorithms employ motion compensation to reduce the amount of data needed to represent
each picture in a video sequence. Although fixed-bit-rate compression algorithms must maintain an overall average
bit-rate near a specified target, they often have some latitude in the number of bits assigned to an individual picture.
Assigning exactly the same number of bits to each picture produces a compressed sequence whose quality fluctuates
with time, a phenomenon which is visually distracting to the viewer. The Picture Bit Allocation subsystem 2 involves
procedures for allocating bits among compressed pictures in a video sequence. It is applicable specifically to video
compression algorithms intended to produce a fixed-bit-rate compressed data stream, and in which motion compensation
is employed, e.g., the ISO/IEC MPEG video compression standard.

[0046] Ideally, a Picture Bit Allocation system would allocate a number of bits to each picture in such a way that the
perceived visual quality of the coded sequence was uniform from picture to picture and equal to the optimum attainable
at the given bit-rate, subject to bit allocation limitations imposed by the fixed-bit-rate rules. In general, such a system
would require knowledge of the contents of the entire sequence prior to coding the first picture or frame. It would also
require a priori knowledge of the visual quality that reconstructed pictures would have when coded using a given bit
allocation. The first requirement is impractical because of the potentially large storage and delay implied. The second is
currently very difficult because a mathematically tractable model of the perceived visual quality of coded visual data is
not known, even when the coded and original pictures are available.

[0047] The Picture Bit Allocation subsystem of the present invention provides a practical solution to this problem by
keeping track of a measure of the difficulty in coding pictures of each type in the recent past. This measure, referred to
as the coding difficulty, depends on the spatial complexity of a picture and the degree to which motion compensation is
able to predict the contents of a picture. Bits are allocated to the three picture types in amounts dependent on the
relative coding difficulties of the three types. Additionally, the three allocations computed at each picture (one for each
picture type) are such that, if an entire Group of Pictures (GOP) were coded using those allocations, the number of bits
required would equal the target bit-rate.

[0048] Referring to Figure 6, the Picture Bit Allocation subsystem 2 determines how many bits to allocate to picture
k after the data F_k for that picture has been analyzed in the AQ/RC Picture Coder 1, and the coding difficulty factor of
the picture has been passed from the AQ/RC Picture Coder 1 to the Picture Bit Allocation subsystem 2, but prior to
coding the picture. The Picture Bit Allocation subsystem 2 also uses information pertaining to previously coded pictures,
which the AQ/RC Picture Coder 1 is assumed to have already passed to the Picture Bit Allocation subsystem 2.
Specifically, this information consists of B_k, the number of bits used to code the most recent picture of each type (broken
into transform coefficient bits and side bits), and E_r, the reconstruction error of the most recent two anchor pictures.
When estimating the number of bits to allocate to a particular picture, it is first necessary to select and consider a fixed
number of consecutive pictures in the immediate future, i.e., a set of pictures in the sequence yet to be coded which
comprises a fixed number of I-pictures (n_I), P-pictures (n_P), and B-pictures (n_B). It is useful, but not necessary, that
the number and composition of pictures in the set selected for consideration in this step be the same as those used for
the picture bit allocation procedure that is performed from picture to picture in the sequence. What is necessary is that
the average of the resulting picture bit allocations over time be equal to the target average picture bit allocation.
[0049] The allocation operation about to be described begins by considering an allocation for the selected set of
pictures, although the final result will be three picture bit allocations, one for each picture type, and only the picture bit
allocation for the picture type corresponding to the type of the picture about to be coded will be used. Thus the process
begins by computing a total bit allocation B_set for the set of pictures which equals the average bit allocation consistent
with the target bit rate:

$$B_{set} = (n_I + n_P + n_B) \times B_{avg},$$

where B_avg is the average picture bit allocation consistent with the target bit rate. In the preferred embodiment, used as
an example throughout this section of the description, the bits allocated to the set of pictures, and those allocated to
each picture, fall into two classes: side bits (S) and coefficient bits (C). Here, S is taken to include all coded data other
than coded transform coefficient data. By subtracting from the total bit allocation B_set an estimate of the number of bits
required to code side information in the set of pictures (S_set), a transform coefficient bit allocation for the set of pictures,
C_set, is obtained. The number of bits allocated to coding the transform coefficients of the picture about to be coded will
then be a fraction of C_set, the size of which will depend on the estimate of the coding difficulty associated with
that picture. An exemplary technique for computing the allocation using the coding difficulty information will now be
particularly described.

Transform Coefficient and Side Information Allocation 

[0050] Side bits are assigned to include picture header information and all side information; for example, the motion
compensation mode information, motion vectors, and adaptive quantization data. Coefficient information is contained
only in the bits used to code the transform coefficients of the pixel data itself (in the case of I-pictures), or the pixel
difference data (in the P- and B-picture cases). Letting A_I, A_P, and A_B be the bit allocations for I-, P-, and B-pictures,
respectively, A_I = S_I + C_I, A_P = S_P + C_P, and A_B = S_B + C_B (where S and C indicate side and coefficient bits,
respectively). In the preferred embodiment, the side information bit allocation for the next picture to be coded is set
equal to the actual number of bits required to code the side information in the most recent picture of the same type in
the sequence. An alternative method of computing the side information bit allocation is to use an average of the actual
numbers of bits required to code several or all past pictures of the same type in the sequence. It is also possible to
ignore the side information allocations in this procedure, and to compute the picture bit allocation based solely on the
transform coefficient bit allocation. This latter approach can be taken, in the context of the following discussion, by
assuming all side allocation variables S_x are equal to 0.

[0051] A means for computing the coding difficulty factor associated with a picture will be described below, but, in the
meantime, for purposes of the description, it will be understood that, once computed, the coding difficulty factor for the
most recent picture of each type is stored in the Picture Bit Allocation subsystem 2, and the following procedure is used
to compute the transform coefficient allocation for the current picture. First, the side information allocation for the set of
pictures is estimated by S_set = n_I S_I + n_P S_P + n_B S_B. This quantity is subtracted from the total number of bits
allocated to the set, B_set, yielding the transform coefficient allocation for the set of pictures:

$$C_{set} = B_{set} - S_{set}.$$

[0052] Then, C_I, C_P, and C_B are found as the unique solution to the equations:

$$C_{set} = n_I C_I + n_P C_P + n_B C_B,$$

$$C_P = w_P \, \frac{D_P - E_r}{D_I} \, C_I,$$

$$C_B = w_B \, \frac{D_B - E'_r}{D_I} \, C_I.$$


[0053] The initial equation (for C_set) in this set ensures that the overall set average is correct. E'_r is the average of the
mean absolute errors of the past and future reconstructed anchor pictures, and the weighting terms w_P and w_B serve
to de-emphasize the P- and B-picture allocation with respect to the others. Values of w_P = 1.0 and w_B = 0.5 are
used in the preferred embodiment. Aside from these weights, the latter two equations (for C_P and C_B) of the set allocate
bits to P- and B-pictures proportional to the degree that their difficulty exceeds the mean absolute error in the
(reconstructed) predicting picture(s).
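
The allocation described in paragraphs [0051] to [0053] can be sketched as follows. This is a minimal illustration and not the patented implementation: the function name, the dictionary layout, and the sample numbers are hypothetical, and it assumes that the P- and B-picture allocations are expressed relative to C_I through the ratios w_P (D_P - E_r) / D_I and w_B (D_B - E'_r) / D_I. Clamping a negative difference to zero is a further assumption.

```python
def allocate_coefficient_bits(c_set, n, d, e_r, e_r_prime, w_p=1.0, w_b=0.5):
    """Return (C_I, C_P, C_B) for one set of pictures.

    c_set      -- transform coefficient bits available for the whole set
    n, d       -- dicts keyed by "I", "P", "B": picture counts and difficulty factors
    e_r        -- mean absolute error of the predicting anchor picture (P case)
    e_r_prime  -- average error of the past and future anchors (B case)
    """
    # Ratios C_P / C_I and C_B / C_I (assumed form of the allocation equations).
    # The max(0, ...) clamp is an assumption that keeps allocations non-negative.
    k_p = w_p * max(0.0, d["P"] - e_r) / d["I"]
    k_b = w_b * max(0.0, d["B"] - e_r_prime) / d["I"]
    # Substitute into C_set = n_I*C_I + n_P*C_P + n_B*C_B and solve for C_I.
    c_i = c_set / (n["I"] + n["P"] * k_p + n["B"] * k_b)
    return c_i, k_p * c_i, k_b * c_i


# Illustrative numbers only: a 15-picture set with 1 I-, 4 P- and 10 B-pictures.
counts = {"I": 1, "P": 4, "B": 10}
difficulty = {"I": 20.0, "P": 8.0, "B": 6.0}
c_i, c_p, c_b = allocate_coefficient_bits(320000.0, counts, difficulty, 2.0, 2.0)
```

Because the three equations are linear in C_I, C_P, and C_B, a single division suffices; the set total is preserved by construction.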

[0054] The foregoing method is valuable because it accounts for the spatial complexity of the sequence through the
three coding difficulty factors, D_I, D_P, and D_B, for the success of the motion compensation through D_P and D_B, for
the target bit-rate through the requirement of the initial equation for C_set, and for the quality of recently coded pictures
through E_r and E'_r.

[0055] Occasionally, the above bit allocation strategy results in an allocation that exceeds U_VBV or falls below L_VBV.
The frequency with which this happens depends on the size of the VBV buffer and on the nature of the sequence. A
typical scenario is when the VBV buffer is relatively small (e.g., six average pictures or less), and the motion
compensation is very successful. In such a situation, the allocation strategy attempts to give virtually all of the transform
bits for a set to the I-pictures, resulting in an allocation for an individual picture larger than the VBV buffer size. In the
preferred embodiment, when this happens, the I-picture allocation is clipped to fall a small amount inside the
corresponding VBV limit, and the bits taken from the I-picture are re-allocated to the P-picture. This latter step is
important, because if no explicit re-allocation is done, the average bit rate will drop. This will eventually result in VBV
overflow problems, usually as L_VBV begins to exceed the B-picture allocations. The net result of that is an implicit
reallocation to B-pictures, which generally results in poorer overall picture quality. An additional benefit of the explicit
P-picture re-allocation technique is more rapid convergence to extremely high picture quality in still scenes. In the case
when a P-picture or B-picture allocation falls outside of the VBV bounds, no re-allocation of bits is done.
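
The clipping and re-allocation step of paragraph [0055] might look like the sketch below. The function name, the `margin` parameter (the "small amount" inside the limit), and the even split of the removed bits over the P-pictures of the set are assumptions made for illustration; the text only states that the bits are re-allocated to the P-picture.

```python
def clip_i_allocation(a_i, a_p, n_i, n_p, u_vbv, margin):
    """Clip the I-picture allocation just inside the upper VBV limit.

    Returns the adjusted (A_I, A_P). Bits removed from the I-pictures are
    spread evenly over the P-pictures of the set (an assumption), so the
    total for the set, and hence the average bit rate, is unchanged.
    """
    limit = u_vbv - margin
    if a_i <= limit:
        return a_i, a_p          # allocation already fits, nothing to do
    excess = a_i - limit
    if n_p == 0:
        return limit, a_p        # no P-pictures available to receive the bits
    return limit, a_p + n_i * excess / n_p
```

For example, with `u_vbv=327680` and `margin=7680`, an I allocation of 400000 bits is clipped to 320000 and each of four P-pictures gains 20000 bits.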

[0056] Note that the allocation strategy can be applied to cases where there are no B-pictures simply by setting
n_B = 0, and ignoring the equation which sets C_B when computing allocations. It can similarly be applied to cases where
no P-pictures exist. In addition, the distinction between coefficient and side information can be ignored, by using the
coding difficulty estimate to allocate all the bits for a picture. In such a case, the coding difficulty estimate could factor in
the difficulty of coding side information directly, or ignore side information completely.

[0057] Two test sequences, the Flower Garden sequence and the Table Tennis sequence, used in the MPEG standards
effort were employed to test the effectiveness of the techniques of the invention. Specifically, a video sequence
composed of the first 60 frames of the Flower Garden sequence, followed by the first 60 frames of the Table Tennis
sequence, followed by 30 repetitions of the 61st frame of Table Tennis (to simulate a still scene) will be used throughout
this description to illustrate the methods. These sequences are 352 x 240 pixel YUV test sequences. The coding was
done at 1.15 Mbits/s with an I-picture spacing of N = 15 and an anchor picture spacing of M = 3. Figure 7 shows the
coding difficulty factors for the entire sequence, and Figure 8 depicts the bit allocations computed for each picture.

[0058] It should be noted that the three bit allocations shown for each picture in the sequence are those just prior to 
coding that picture, but that only one of these allocations is actually used. The target bit-rate resulting from the alloca- 
tion method is shown along with the actual bit-rates for the sequence in Figure 9. 

[0059] The stability at the scene change (frame 61) and the convergence of the actual bit-rate for P- and B-pictures
to nearly zero in the still segment (frames 121-151) will be noted. The quantization factors (QP) used to code the
sequence are plotted in Figure 10. Note also that I- and P-pictures are generally coded with a finer step size than
B-pictures.

AQ/RC Picture Coder 


[0060] Turning now to the AQ/RC Picture Coder 1, this subsystem involves procedures for the adaptive quantization
(AQ) of the successive pictures of a video sequence to achieve improved visual quality, while ensuring that the number
of bits used to code each picture is close to a predetermined target. Procedures are performed for I-pictures, P-pictures,
and B-pictures. These procedures involve treating the spatial regions making up a picture using a region classification
strategy which works in tandem with:

motion estimation;

an adaptive model of the number of bits required to code a picture region as a function of the quantization factor
QP and measured characteristics of the region; and,

a scheme for adapting the quantization level as a picture is coded to ensure that the overall number of bits
produced is close to the predetermined target.

Although, for purposes of description here, the spatial regions will be treated as MPEG macroblocks (MB), it should be
understood that the procedures described may be applied to regions of different sizes and shapes.

[0061] Figure 11 generally illustrates the components of the AQ/RC Picture Coder 1. The operation of this subsystem
depends on the type of picture being coded. As seen in the Figure, a video picture signal F_k, for a picture k, which may
or may not have been pre-processed in the QP-adaptive Pre-processor 3, enters a Motion Estimation and MB
Classification unit 14 of the AQ/RC Picture Coder 1. There, the signal is analyzed and each MB is classified according
to procedures described below. If the picture is a P-picture or B-picture, motion estimation is also performed. Results of
these operations in the form of a coding difficulty factor, D_k, are passed to the Picture Bit Allocation subsystem 2, for
use as detailed above. The Picture Bit Allocation subsystem 2 then returns a bit allocation signal C_k for picture k. This
bit allocation signal is used by a QP-level Set unit 15, along with a set of information passed from the Motion Estimation
and MB Classification unit 14, to determine initial values of the quantization factor QP to be used in coding each MB.
Additionally, the QP-level Set unit 15 computes an estimate of the number of bits required to code each row of MB's in
the picture. These quantization factors and row targets are passed to the Rate-controlled Picture Coder unit 16, which
proceeds to code the picture, also using information passed from the Motion Estimation and MB Classification unit 14.
Since the operation of the AQ/RC Picture Coder 1 is partitioned among three sub-units, the description that follows will
follow the same partition while referring primarily to Figure 11.

Motion Estimation and MB Classification Unit

[0062] One of the primary purposes of the Motion Estimation and MB Classification unit 14 is to determine which
coding mode m(r,c) will be used to code each MB in a picture. This function is only used for motion compensated pictures,
since there is only one mode for MB's in I-pictures: intramode. The mode decision relies on a motion estimation
process, which also produces motion vectors and motion-compensated difference MB's. Another important function of the
Motion Estimation and MB Classification unit 14 is to classify each MB. The class cl(r,c) of MB (r,c) will ultimately
determine the value of the quantization factor QP(r,c) used to code that MB. The modes and classes are determined by
analyzing each picture, and estimating the motion between the picture to be coded and the predicting picture(s). The same
information is also used to compute the coding difficulty factor, D_k, which is passed to the Picture Bit Allocation
subsystem 2.

[0063] The objective of motion estimation in the MPEG video coding algorithm is to obtain a motion vector
mv(r,c) = (r_mv, c_mv) and the associated motion-compensated difference MB M_k(r,c). The motion-compensated
difference MB is the pixel-wise difference between the current MB under consideration and the predicting MB. The exact
method for forming the prediction MB depends on the motion compensation mode employed, and is detailed in the
above-noted ISO-IEC JTC1/SC2/WG11 MPEG CD-11172, MPEG Committee Draft, 1991. The motion vector should,
in some sense, be indicative of the true motion of the part of the picture with which it is associated. Details of motion
estimation techniques can be found in A. N. Netravali and B. G. Haskell, Digital Pictures: Representation and
Compression, New York, NY: Plenum Press, 1988.

[0064] For purposes of the present description, it will be assumed that a full search motion estimation algorithm was
used covering a range of ±7 × n pixels in the horizontal and vertical directions, where n is the distance in picture
intervals between the picture being analyzed and the predicting picture, and where the motion vectors are accurate to half
a pixel. The present invention involves techniques for using the results of motion estimation to code video sequences,
but is not limited to use with any specific motion estimation technique, and can be used with any motion estimation
method, provided that a measure of the success of motion compensation (motion compensation error), which indicates
how good the match is between the MB being compensated and the predicting region pointed to by the motion vector,
can be made available. It will be recalled that for P-pictures, there is one type of motion estimation (forward-in-time),
and for B-pictures there are three types (forward-in-time, backward-in-time, and interpolative-in-time). The forward
motion vector for MB (r,c) may be denoted as mv_f(r,c), and the backward motion vector as mv_b(r,c). The interpolative
mode uses both forward and backward vectors. The forward, backward, and interpolative motion compensation errors
may be denoted as Δ_mc,f(r,c), Δ_mc,b(r,c), and Δ_mc,i(r,c), respectively.

[0065] In addition to the motion compensation error(s), a measure of the spatial complexity of each MB is needed.
Denote this measure as Δ(r,c). It is important that Δ(r,c), Δ_mc,f(r,c), Δ_mc,b(r,c), and Δ_mc,i(r,c) are like measures, in the
sense that numerical comparison of them is meaningful. In the preferred embodiment, these measures are all defined
to be mean absolute quantities, as indicated below. Labeling each MB by its row and column coordinates (r,c), denote
the luminance values of the four 8x8 blocks in MB (r,c) by y_k(i,j), i=0,...,7, j=0,...,7, k=0,...,3, and the average value of
each 8x8 block by dc_k. Then, the spatial complexity measure for MB (r,c) is taken to be the mean absolute difference
from DC, and is given by

$$\Delta(r,c) = \frac{1}{4} \sum_{k=0}^{3} \Delta_k(r,c),$$

where

$$\Delta_k(r,c) = \frac{1}{64} \sum_{i=0}^{7} \sum_{j=0}^{7} \left| y_k(i,j) - dc_k \right|.$$

[0066] The like motion compensation error is the mean absolute error. Denoting the four 8x8 blocks in the predicting
MB by p_k(i,j), i=0,...,7, j=0,...,7, k=0,...,3, this is defined by

$$\Delta_{mc}(r,c) = \frac{1}{4} \sum_{k=0}^{3} \frac{1}{64} \sum_{i=0}^{7} \sum_{j=0}^{7} \left| y_k(i,j) - p_k(i,j) \right|.$$
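
As a concrete illustration of these two measures, the following sketch computes the mean absolute difference from DC and the mean absolute motion compensation error for a 16x16 luminance macroblock held as a list of 16 rows. The function names and the list-of-rows representation are hypothetical choices made for the example; the normalization (per-block means averaged over four blocks) is assumed from the description of the measures as mean absolute quantities.

```python
def split_blocks(mb):
    """Split a 16x16 luminance macroblock into its four 8x8 blocks."""
    return [[row[c:c + 8] for row in mb[r:r + 8]] for r in (0, 8) for c in (0, 8)]


def block_activity(block):
    """Mean absolute difference from the block average (DC) of one 8x8 block."""
    pixels = [p for row in block for p in row]
    dc = sum(pixels) / len(pixels)
    return sum(abs(p - dc) for p in pixels) / len(pixels)


def spatial_complexity(mb):
    """Return (average, minimum) of the four block activities of a macroblock."""
    acts = [block_activity(b) for b in split_blocks(mb)]
    return sum(acts) / 4, min(acts)


def mc_error(mb, pred):
    """Mean absolute error between a macroblock and its predicting macroblock."""
    diffs = [abs(a - b) for ra, rb in zip(mb, pred) for a, b in zip(ra, rb)]
    return sum(diffs) / len(diffs)
```

The minimum block activity is returned alongside the average because the classification step described later uses the minimum over the four blocks.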

[0067] In the preferred embodiment of the invention, the coding difficulty factors passed to the Picture Bit Allocation
subsystem 2 are based completely on the above measures of spatial complexity and motion compensation error. For
I-pictures, the total difficulty factor is

$$D_I = \sum_{r,c} \Delta(r,c).$$


[0068] For P-pictures and B-pictures, the coding mode is first decided upon, and the measure associated with that
mode is used in a summation similar to the one above. The following modes are possible:

intramode: m(r,c) = I,

forward mc: m(r,c) = mc_f,

backward mc: m(r,c) = mc_b,

interpolative mc: m(r,c) = mc_i,

and the difficulty factors are computed by

$$D_P = \sum_{m(r,c)=I} \Delta(r,c) + \sum_{m(r,c)=mc_f} \Delta_{mc,f}(r,c),$$

$$D_B = \sum_{m(r,c)=I} \Delta(r,c) + \sum_{m(r,c)=mc_f} \Delta_{mc,f}(r,c) + \sum_{m(r,c)=mc_b} \Delta_{mc,b}(r,c) + \sum_{m(r,c)=mc_i} \Delta_{mc,i}(r,c).$$

[0069] Many possible rules can be used to decide which mode to employ. In the preferred embodiment, the following
rule is used for P-pictures:

$$m(r,c) = \begin{cases} I, & \Delta(r,c) < \beta \, \Delta_{mc,f}(r,c), \\ mc_f, & \text{otherwise}. \end{cases}$$


[0070] A value of β = 1.0 is used. In the preferred embodiment, the mode selection rule used for B-pictures is: the
mode with the lowest Δ(r,c) is used to code the MB. It is to be appreciated that, although mean absolute quantities were
used as the measures of coding difficulty in the preferred embodiment, any like measures (for example, mean square
quantities) could also be used.
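
The mode rules and the difficulty summation can be sketched as below. The function names and mode labels are hypothetical, and the B-picture rule is read here as choosing whichever of the four measures (intramode, forward, backward, interpolative) is smallest, which is an interpretation of the text rather than a quoted formula.

```python
def p_picture_mode(delta, delta_mc_f, beta=1.0):
    """Intramode if the spatial measure is below beta times the forward MC error."""
    return "I" if delta < beta * delta_mc_f else "mc_f"


def b_picture_mode(delta, delta_mc_f, delta_mc_b, delta_mc_i):
    """Pick the mode whose measure is lowest (ties resolve in the order listed)."""
    measures = {"I": delta, "mc_f": delta_mc_f, "mc_b": delta_mc_b, "mc_i": delta_mc_i}
    return min(measures, key=measures.get)


def p_picture_difficulty(mbs, beta=1.0):
    """Sum, over all MBs, the measure belonging to the mode chosen for each MB.

    mbs is an iterable of (delta, delta_mc_f) pairs, one per macroblock.
    """
    total = 0.0
    for delta, delta_mc_f in mbs:
        mode = p_picture_mode(delta, delta_mc_f, beta)
        total += delta if mode == "I" else delta_mc_f
    return total
```

With β = 1.0 the P-picture difficulty reduces to the sum of the smaller of the two measures for each macroblock.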

[0071] It is intended that the measures used to determine MB modes and compute coding difficulties could be
by-products of the motion estimation procedure. This is possible, in part, because the measures described above are often
used to find the best motion vector in motion estimation procedures.

[0072] These measures are also used to classify macroblocks. In the preferred embodiment, the MB's are classified
as follows. The class of all intramode MB's is computed by quantizing the minimum value of Δ_k(r,c) for that MB. Defining
a threshold t, the class cl(r,c) of MB (r,c) is given by

$$cl(r,c) = \frac{\min_k \left[ \Delta_k(r,c) \right]}{t}.$$


[0073] After a motion compensation mode has been chosen for motion compensated MB's, they are classified
according to:

$$cl(r,c) = \frac{\min\left[ \min_k \left[ \Delta_k(r,c) \right], \; \Delta_{mc}(r,c) \right]}{t}.$$


[0074] A value of t = 2 is used in the preferred embodiment. Note that both intramode and motion compensated
measures are used to classify motion compensated MB's. The mode and class information is used, along with the underlying
measures, by the QP-level Set unit 15 to determine an initial quantization level, and by the RC Picture Coder unit 16
during coding.
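
A short sketch of the classification step follows. The function name is hypothetical, and the use of integer truncation to "quantize" the ratio is an assumption; the text states only that the class is obtained by quantizing the measure against the threshold t.

```python
def mb_class(min_block_activity, t=2.0, delta_mc=None):
    """Classify a macroblock from its smallest block activity.

    For motion-compensated MBs, pass the motion compensation error as
    delta_mc so that the smaller of the two measures is classified.
    Truncation toward zero is an assumed quantization rule.
    """
    measure = min_block_activity
    if delta_mc is not None:
        measure = min(measure, delta_mc)
    return int(measure // t)
```

For instance, `mb_class(5.0)` gives class 2, while `mb_class(5.0, delta_mc=1.5)` gives class 0 because the well-predicted macroblock is treated as smooth.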

[0075] Typical class distributions for I- and P-pictures taken from both the Flower Garden and Table Tennis segments 
of the sequence are shown in Figure 12. 
[0076] To keep computational complexity low in the preferred embodiment, B-picture MB's are not classified, the
QP-level Set unit 15 is not used, and the coding scheme employed in the RC Picture Coder unit 16 is simpler than that used
for I-pictures and P-pictures.

QP-Level Set Unit 


[0077] The function of the QP-level Set unit 15 is to compute an initial value for the quantizer step size for each class.
All MB's in a given class are assigned the same quantization step size. In the preferred embodiment, the quantization
step size for each class relative to an overall minimum step size is assigned according to:

$$QP(r,c) = QP_{low} + \Delta QP \times cl(r,c).$$

[0078] Values of ΔQP that have been used in the preferred embodiment are 5 and 6. Note that the allowed range for
QP_low in the preferred embodiment is -31 to 31, although MPEG only allows for integer values of QP(r,c) in the range
of 1 to 31. Therefore, whenever the above formula produces a value above 31, it is clipped to 31, and any values which
fall below 1 are clipped to 1. It is beneficial to allow QP_low to be less than 1 to ensure that the finest quantizer step sizes
can be applied to MB's of all classes, if the bit-rate warrants it. The process for selecting the initial value QP_low^init of
QP_low is explained below.
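
The class-to-step-size mapping with clipping can be written in a few lines. The function name is hypothetical; the default ΔQP of 5 is one of the two values mentioned in the text.

```python
def mb_qp(qp_low, cl, delta_qp=5):
    """Quantizer step size for a macroblock of class cl, clipped to 1..31."""
    return max(1, min(31, qp_low + delta_qp * cl))
```

Allowing a negative `qp_low` matters here: with `qp_low = -10`, class 0 macroblocks are clipped to 1 while class 3 macroblocks still receive a fine step size of 5.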

[0079] The underlying model of human perception of coding errors used in the preferred embodiment, as reflected in
the method for computing the class cl(r,c) of each MB and for computing QP(r,c), given cl(r,c), is that like-magnitude
errors are more visible in less active regions of a picture. While this model is clearly an over-simplification, it is a
reasonable compromise between visual quality and computational burden. The rationale behind using the minimum Δ_k
over the four luminance blocks in the MB for classification, rather than the Δ of the entire block, is that MB's with any
smooth regions should be assigned a low quantizer step size.

[0080] The MB modes m(r,c) and classes cl(r,c) are used along with the Δ(r,c) and Δ_mc(r,c) values and the target
bit-rate for the picture transform coefficients to set the initial quantizer low value QP_low. A model has been developed in
accordance with the invention which predicts the number of bits required to code the transform coefficients of an MB,
given the quantization value to be used and Δ (in the case of intramode MB's) or Δ_mc (for motion-compensated MB's).
Experimental data leads to a model of the form:

$$B_I(QP,r,c) = a_I \, \Delta(r,c) \, QP^{b_I}$$

for intramode MB's and

$$B_{mc}(QP,r,c) = a_P \, \Delta_{mc}(r,c) \, QP^{b_P}$$

for motion-compensated MB's. The exponents are b_I = -0.75 and b_P = -1.50. However, these values depend strongly on
the particular quantization weighting values w_mn being used, and should be optimized to match them.

[0081] To estimate appropriate values for the a and b parameters, the following experimental approach has been
taken. Consider the case of the I-picture model, for which it is desired to estimate a_I and b_I. Because the parameters of
the model are to be adapted to track changes from picture to picture, the primary interest will be the model's accuracy
relative to an individual picture, rather than an ensemble of pictures. Accordingly, a representative picture is encoded
several times, using a different value of the QP quantizer step size for each pass. The number of bits required to code
each MB at each value of QP is measured. Next, for each value of QP, the number of bits required to code all MB's
having a given value of Δ is averaged. The result is a two-dimensional data set which indicates the average number of
bits required to code MB's as a function of the Δ value of the MB and the QP step size used to code it. These average
values may be denoted as

$$B_{ij} = B(QP_i, \Delta_j).$$

It is desired to fit these measured values to an equation of the form:

$$B_{ij} = a_I \, \Delta_j \, QP_i^{b_I}.$$

[0082] This is an overdetermined set of nonlinear equations in a_I and b_I, and can be solved using nonlinear least
squares methods. In order to linearize the problem, logarithms of both sides of the equation are taken. This results in an
easily solved linear least squares problem in log(a_I) and b_I.

[0083] The linear parameters a_I and a_P should be adjusted after coding each I- or P-picture, to track the dynamically
changing characteristics of the video sequence. This can be done according to a method which will be described in
detail in the description of the RC Picture Coder unit 16 below. (For intramode MB's, this model can be improved by
adding an additional term to account for the number of bits required to code the DC terms in the MB, since the coding for
DC coefficients is handled separately.)

[0084] The predicted number of bits required to code the transform coefficients for the entire picture according to
these bit-rate models is

$$B(QP_{low}) = \sum_{r,c} B_I(QP(r,c), r, c)$$

for I-pictures and

$$B(QP_{low}) = \sum_{m(r,c)=I} B_I(QP(r,c), r, c) + \sum_{m(r,c)=mc_f} B_{mc}(QP(r,c), r, c)$$

for P-pictures, where QP(r,c) is computed according to

$$QP(r,c) = QP_{low} + \Delta QP \times cl(r,c).$$
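
The log-linearized fit mentioned in paragraph [0082] can be sketched as an ordinary least squares line through log(bits / Δ) versus log(QP). The function name and the `(qp, delta, avg_bits)` sample layout are hypothetical; the sketch assumes the model form bits = a × Δ × QP^b, so that log(bits / Δ) = log(a) + b × log(QP).

```python
import math


def fit_bit_model(samples):
    """Least squares estimate of (a, b) in bits = a * delta * qp**b.

    samples is a list of (qp, delta, avg_bits) measurements with qp > 0,
    delta > 0 and avg_bits > 0, taken at two or more distinct qp values.
    """
    xs = [math.log(qp) for qp, _, _ in samples]
    ys = [math.log(bits / delta) for _, delta, bits in samples]
    n = len(samples)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Slope and intercept of the regression line y = log(a) + b * x.
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Synthetic measurements generated from a = 12, b = -0.75 (illustrative values).
data = [(qp, d, 12.0 * d * qp ** -0.75) for qp in (2, 4, 8, 16) for d in (1.0, 3.0)]
a_fit, b_fit = fit_bit_model(data)
```

Dividing by Δ before taking logarithms removes one unknown, which is why a one-dimensional regression is enough for this model form.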



[0085] The initial value for QP_low is taken as that value of QP for which B(QP) is closest to the picture transform
coefficient allocation C:

$$QP_{low}^{init} = \arg\min_{QP} \left| B(QP) - C \right|.$$


[0086] In the preferred embodiment, a half-interval search is conducted between -31 and 31 to find QP_low^init. The role
of the upper and lower bounds on QP in this procedure is subtle. While an upper bound of 31 is sufficient to guarantee
that the encoder can operate with the coarsest possible quantization allowed by the standard, a larger upper bound will
change the performance of the rate control algorithm, as will be described below in greater detail, by making it more
sensitive to the over-production of bits. Similar properties hold for the lower bound on QP.
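
A half-interval (bisection) search of this kind might be sketched as follows. The function names, the `(measure, cl)` macroblock representation, and the use of a single parameter pair (a, b) for all macroblocks are simplifying assumptions; a P-picture would use separate intramode and motion-compensated parameters. The search relies on the predicted bit count being non-increasing in QP_low, which holds when b is negative.

```python
def predicted_bits(qp_low, mbs, a, b, delta_qp=5):
    """Model prediction of coefficient bits for a picture at a given QP_low.

    mbs is a list of (measure, cl) pairs: the complexity measure and class
    of each macroblock.
    """
    total = 0.0
    for measure, cl in mbs:
        qp = max(1, min(31, qp_low + delta_qp * cl))  # clip to the MPEG range
        total += a * measure * qp ** b
    return total


def initial_qp_low(c, mbs, a, b, delta_qp=5, lo=-31, hi=31):
    """Integer QP_low in [lo, hi] whose predicted bits are closest to c."""
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if predicted_bits(mid, mbs, a, b, delta_qp) > c:
            lo = mid   # still too many bits: quantize more coarsely
        else:
            hi = mid   # at or under the allocation: try finer quantization
    return min((lo, hi), key=lambda q: abs(predicted_bits(q, mbs, a, b, delta_qp) - c))
```

About six evaluations of the model suffice to cover the 63 candidate values, compared with 63 for an exhaustive scan.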

[0087] Once QP_low has been determined, the QP-level Set unit 15 computes the expected number of bits required to
code row r of MB's using QP_low, by

$$T(r) = \sum_{c} B(QP(r,c), r, c) + \frac{C - B(QP_{low})}{N_{row}} + \frac{S}{N_{row}},$$

where B(QP(r,c),r,c) is the model prediction (B_I or B_mc) for the mode of MB (r,c) and N_row is the number of rows of
MB's. The second term in this expression accounts for the difference between the number of bits predicted by the model
at QP_low and the actual transform coefficient allocation C, and the third term accounts for each row's share of the side
information allocation S. The sum of the targets T(r) over all the rows yields the total picture allocation A. These
expected values become target row bit-rates for the RC Picture Coder unit 16.
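
The three terms of the row target can be illustrated with a short sketch. The function name and the list input are hypothetical; the per-row model predictions are assumed to have been computed already from the bit-rate model at QP_low.

```python
def row_targets(row_model_bits, c, s):
    """Target bits T(r) for each macroblock row.

    row_model_bits -- model-predicted coefficient bits for each row at QP_low
    c              -- transform coefficient allocation C for the picture
    s              -- side information allocation S for the picture
    """
    n_row = len(row_model_bits)
    model_total = sum(row_model_bits)
    # Each row gets its own prediction, an equal share of the model mismatch,
    # and an equal share of the side information allocation.
    return [bits + (c - model_total) / n_row + s / n_row for bits in row_model_bits]
```

Summing the returned targets gives C + S, the total picture allocation A, regardless of how far the model total is from C.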


Rate-controlled Picture Coder 

[0088] Picture coding proceeds by indexing through the MB's and coding each according to the mode and quantizer
step sizes determined in the previous steps. However, because of mismatches in the bit-rate model and the continual
changing of the contents of a sequence, the actual number of bits produced will not exactly match the expected number.
It is desired to control this deviation, not only to keep the actual bits produced for the picture close to the target, but also
to prevent violation of the VBV bit-rate limitations. A rate control feedback strategy has been developed in accordance
with the invention which updates QP_low at the end of each row of MB's. A number of factors determine the update. One
factor is that different rows of MB's in a picture are not expected to produce the same number of bits, because of
variations in Δ(r,c) and Δ_mc(r,c), as well as assigned quantizer step sizes. At the end of each row, the number of bits
produced is compared to the expected number T(r) computed in the QP-level Set unit 15. Another factor which plays a role
in updating QP_low is the closeness of both the picture allocation and the actual number of bits produced to the VBV limit.
The gain of the QP_low update as a function of bit-rate deviations is a function of the proximity of the VBV limit in the
direction of the error. Minor deviations from the predicted bit-rate cause little or no change in QP_low, while deviations
which bring the picture bit-rate close to one or the other of the VBV limits cause the maximum possible adjustment in
QP_low. Such a strategy is quite successful in preventing VBV violations, hence avoiding undesirable actions like the
dropping of coded data or the stuffing of junk bytes into the bit stream.

[0089] The following equations describe the update procedure for QP_low, as implemented in the preferred embodiment.
Denoting the total number of bits used to code row m and all preceding rows by B(m), and the difference between
B(m) and the cumulative target as ΔB(m):






QP%1 = argmin|/?«2/ > ) - Ot|. 



$$\Delta B(m) = B(m) - \sum_{r=1}^{m} T(r).$$


[0090] After coding row m, QP_low is updated if ΔB(m) ≠ 0 as follows:
| QthZ + QF *£ 31 AB(m) AB(m)<0,



where Δ_u and Δ_l are the differences between the picture allocation A and the upper and lower VBV limits for picture n,
respectively:

$$\Delta_u = U_n - A,$$

$$\Delta_l = \max(0, L_n) - A.$$

[0091] This strategy updates QP_low based on the total bit allocation error up to the current row, as it relates to the
maximum error allowed according to the VBV criterion.
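
The row-level feedback described above can be sketched as follows. This is a minimal illustration, not the formula of the preferred embodiment: it assumes that the adjustment grows linearly with the ratio of the cumulative error ΔB(m) to the VBV margin in the direction of the error, and that the quantizer parameter ranges from 1 to 31. The function names and the linear form are assumptions made for the example.

```python
QP_MIN, QP_MAX = 1, 31  # assumed quantizer parameter range


def cumulative_error(bits_per_row, targets, m):
    """Delta B(m): bits used for rows 1..m minus the cumulative row targets."""
    return sum(bits_per_row[:m]) - sum(targets[:m])


def vbv_margins(A, U_n, L_n):
    """Margins between the picture allocation A and the VBV limits."""
    delta_u = U_n - A           # room above the allocation
    delta_l = max(0, L_n) - A   # room below the allocation (zero or negative)
    return delta_u, delta_l


def update_qp_low(qp_low, delta_b, delta_u, delta_l):
    """Assumed linear feedback: full adjustment when the error reaches a margin."""
    if delta_b > 0:    # too many bits so far: quantize more coarsely
        frac = 1.0 if delta_u <= 0 else min(1.0, delta_b / delta_u)
        qp = qp_low + (QP_MAX - qp_low) * frac
    elif delta_b < 0:  # too few bits so far: quantize more finely
        frac = 1.0 if delta_l >= 0 else min(1.0, delta_b / delta_l)
        qp = qp_low - (qp_low - QP_MIN) * frac
    else:
        qp = qp_low
    return int(round(qp))
```

With this form, a small error relative to the margin rounds back to the same QP_low, while an error equal to the margin drives QP_low to the end of its range, which matches the qualitative behavior described for the feedback strategy.
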

[0092] After each I- or P-picture is coded, new bit-rate model parameters (a_I and a_P) are computed so that the
bit-rate model will agree with the number of transform coefficient bits actually produced (C_a). To illustrate this for the
I-picture case, during the course of coding each picture, the sum of all Δ(r,c) for MB's coded with each value of QP is gen-



[0093] An updated value of a_I is computed by

$$a_I = (1 - \alpha)\,a_I + \alpha\,a'_I.$$

[0094] A value of α = 0.667 may be used in the implementation. A similar strategy is used to update both a_I and a_P
after coding a P-picture. In that case, α is proportional to the fraction of MB's coded in the mode corresponding to the
bit-rate model parameter being updated.
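
The parameter update is a first-order recursive average, which can be illustrated with a short sketch. The function names are hypothetical, and the P-picture variant assumes that the proportionality constant is the same 0.667 used for I-pictures; the text only states that α is proportional to the fraction of MB's in the corresponding mode.

```python
def update_model_parameter(a_old, a_measured, alpha=0.667):
    """Blend the previous parameter with the value measured on the last picture."""
    return (1.0 - alpha) * a_old + alpha * a_measured


def update_after_p_picture(a_old, a_measured, mode_fraction, alpha_max=0.667):
    """P-picture case: scale alpha by the fraction of MB's coded in this mode.

    alpha_max is an assumed proportionality constant, not a value from the text.
    """
    return update_model_parameter(a_old, a_measured, alpha_max * mode_fraction)
```

For example, with a previous value of 1.0 and a measured value of 2.0, the I-picture update gives 0.333 · 1.0 + 0.667 · 2.0 = 1.667, so the model moves two thirds of the way toward the newly measured value.
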

[0095] Finally, the number of bits used to code all side information for the picture is stored for use as the value of the 
side information allocation S for the next picture of the same type. 

[0096] The performance of the QP assignment and update strategies is depicted in Figures 13 and 14. Figure 13
shows the QP_low and average QP in each row of frames 16, 22, 61, and 67 of the test sequence. It should be
understood that, if the initial guess for QP_low and the bit-rate models were exact, there would never be any change in QP_low
from row to row. However, QP_avg would fluctuate depending on the spatial activity and motion compensability of the
different rows in the pictures. For instance, it can easily be seen, from the I-picture QP values, that the lower half of the
rows of the Flower Garden segment is far more complex spatially than the upper half. The P-picture results show that
motion compensation reduces the variation in QP_avg. Figure 14 shows the bits produced versus the targets on a
row by row basis. The results can be seen to track the targets reasonably well.

[0097] The rate control method for B-pictures differs from that of I- and P-pictures. No MB classification has been
done, and hence no attempt is made to estimate the amount of compressed data each row of MB's will produce. Thus
all row targets in a picture are the same. At the start of each picture, the quantizer factor is set equal to the value it had
at the end of the previous B-picture. After each row of MB's, QP is updated in much the same fashion as for the other
picture types, but with the upper and lower bounds determined by

$$\Delta_u = \max(U_{VBV} - A,\ A),$$

$$\Delta_l = \max(0, L_{VBV}) - A.$$

[0098] The foregoing presents a motion video coder procedure which uses adaptive bit allocation and quantization to
provide robust, high quality coded sequences over a range of source material and bit-rates. The coded data adheres to
the fixed-bit-rate requirements of the ISO/IEC MPEG video coding standard. The additional coder complexity required
to implement the adaptive techniques is modest with respect to the basic operations of motion estimation, discrete
cosine transforms, quantization, and Huffman coding, which are part of a basic coder. These features make the
algorithm suitable for flexible, real-time video codec implementations.

Adaptive Pre-Processing of Video Sequences

[0099] The operation of the QP-adaptive Pre-processor 3 of the invention is based on the observation that, under
certain conditions, more visually pleasing images are produced by low bit-rate coders when the input pictures have been
pre-processed to attenuate high-frequency information and/or to remove noise, which is inefficient to code, but visually
less significant than low-frequency noise-free information. Specifically, when sequences contain regions of
non-negligible size which are spatially very complex, or if noise has been introduced for some reason, an inordinate number of bits
is required to represent the high-detail regions and noise accurately, leading to an overall degradation in visual quality.
This degradation often takes the form of visually distracting, flickering noise-like artifacts. It is often a good trade-off to
reduce the high-frequency content by pre-processing such as linear or non-linear filtering, which makes the images look
less like the original, but which allows for better rendition of the low-frequency information without distracting artifacts.
On the other hand, many sequences are such that the visual quality at low bit-rates is quite acceptable without any need
to reduce the high-frequency information and noise. In cases such as this, pre-processing introduces degradations
unnecessarily. Thus, it is desirable to be able to pre-process or not to pre-process, depending on the need.
[0100] One important indicator of the need for pre-processing is the quantization level required to code the sequence
at the target bit-rate. The main advantage of using information about the quantization factor to control the amount of
pre-processing is that it is independent of the bit-rate. Generally speaking, if the quantization level is very high (implying
coarse quantization and hence poor quality reconstruction) much of the time, the reason is that the scene is too
complex to code accurately at the target bit-rate.

[0101] The general operation of the third subsystem of the invention will be described with reference to Figure 6 along
with reference to the components of the QP-Adaptive Pre-processor 3 generally shown in Figure 15 and a preferred
operational embodiment shown in Figure 16. As described above in connection with the operation of the AQ/RC Picture
Coder 1, as each picture is coded, a previous quantization factor, QP_prev, used to quantize the transform coefficients is
computed. This quantization level can depend on many things, including the number of MB's of each type in the picture,
the number of bits allocated to the picture by the Picture Bit Allocation subsystem 2, and the overall complexity of the
picture. The average QP used to code a picture is often a good quantity to use for QP_prev. After coding of each picture
is complete, QP_prev is passed to the QP-Adaptive Pre-processor 3 from the AQ/RC Coder 1. Based on the values of
QP_prev from possibly more than one previous picture, one of several pre-processors is selected to be applied to all
pictures starting at some point in the future, and continuing until a new value of QP_prev from a later picture causes another
change in the pre-processor. As seen in Figure 15, the QP_prev signal is received in an Implementation Lag Buffer 31
and passed to a Pre-processor Algorithm Selector unit 32 which controls switching of the signal to a Pre-processor unit
33.

[0102] The Pre-processor unit 33 can consist of a set of filters, Filter 1, Filter 2, ..., Filter n. One preferred
implementation of Pre-processor unit 33 is shown in Figure 16, wherein the pre-processor filters are purely linear, and there are three
possible filter states (FS).


1. FS = 0 No filter. 

2. FS = 1 Separable 3 tap FIR filter with coefficients 

3. FS = 2 Separable 3 tap FIR filter with coefficients (1 
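
A separable 3-tap FIR filter applies the same one-dimensional kernel first along the rows and then along the columns of a picture. The sketch below illustrates this structure only. The tap values passed in the usage note, (1/4, 1/2, 1/4), are a hypothetical low-pass choice and are not taken from the filter states listed above; replicating the border samples at the picture edges is likewise an assumption.

```python
def filter_1d(samples, taps):
    """Apply a 3-tap FIR filter to one line, replicating the border samples."""
    n = len(samples)
    out = []
    for i in range(n):
        left = samples[max(i - 1, 0)]
        right = samples[min(i + 1, n - 1)]
        out.append(taps[0] * left + taps[1] * samples[i] + taps[2] * right)
    return out


def separable_filter(image, taps):
    """Filter every row, then every column, with the same 3-tap kernel."""
    rows = [filter_1d(row, taps) for row in image]
    cols = [filter_1d(list(col), taps) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row order
```

Calling `separable_filter(image, (0.25, 0.5, 0.25))` leaves a flat picture unchanged and spreads an isolated bright sample over its neighbors, which is the attenuation of high-frequency detail that the stronger filter states are meant to provide.
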

[0103] One algorithm useful for updating the filter states under the control of units 31 and 32 is as follows:

$$FS \leftarrow \begin{cases} \min(2,\ FS + 1) & \text{if } Q_{avg} > T_1, \\ \max(0,\ FS - 1) & \text{if } Q_{avg} < T_2. \end{cases}$$



[0104] The filter state update takes place only after I-pictures, and the new state does not go into effect until the next
I-picture (this delay is referred to as implementation lag). Useful values of T_1 and T_2 are 10 and 5, respectively.
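
The filter state update and its implementation lag can be sketched as a small state machine. The thresholds 10 and 5 come from the text; the class and method names are hypothetical, and leaving the state unchanged when the average quantizer lies between the two thresholds is an assumption.

```python
T1, T2 = 10, 5  # thresholds given in the text


def next_filter_state(fs, q_avg, t1=T1, t2=T2):
    """Step toward stronger filtering when Q_avg is high, weaker when it is low."""
    if q_avg > t1:
        return min(2, fs + 1)
    if q_avg < t2:
        return max(0, fs - 1)
    return fs  # assumed: no change between the thresholds


class FilterStateController:
    """Holds the active state and the state pending for the next I-picture."""

    def __init__(self):
        self.active = 0   # FS = 0: no filter
        self.pending = 0

    def start_i_picture(self):
        """The pending state takes effect at the start of an I-picture."""
        self.active = self.pending
        return self.active

    def finish_i_picture(self, q_avg):
        """After coding an I-picture, decide the state for the next one."""
        self.pending = next_filter_state(self.active, q_avg)
```

The gap between the two thresholds gives the controller hysteresis, so a sequence whose average quantizer hovers between 5 and 10 does not toggle the filter on every I-picture.
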
[0105] The particular choices of filters, filter states, filter state update rule, and implementation lag described above
represent but one of many possibilities within the scope of the invention. It is contemplated that there can be an arbitrary
number of filters, and they can be nonlinear or spatially adaptive. Another important variation is to perform the filter state
update more frequently, and to simultaneously reduce the implementation lag. For example, the filter state update can
take place after every P-picture, with the implementation lag reduced to the delay between P-pictures.

Claims 

1. A method for the allocation of bits to be used to compression code digital data signals representing a set or sets of
pictures in a motion video sequence, in the form of an M-PEG II data stream, comprising the steps of:

identifying each picture in said set or sets to be compression coded as one of three types I, P, B;

determining the total number of bits (B_set) to be used in compression coding each set of pictures based on a
fixed target bit rate for each sequence wherein said total bit number (B_set) equals an average bit allocation
(B_avg) multiplied by the total number of I-pictures (n_I), P-pictures (n_P) and B-pictures (n_B) occurring in said set
so that B_set = (n_I + n_P + n_B) · B_avg; and, after preferably subtracting from said total number of bits (B_set) an
estimate (S_set) of the number of bits required to code side information in each set resulting in a modified total
number of bits (C_set), allocating from said total number of bits (C_set or B_set), bits for use in compression coding
a picture in each set by determining the allocations (C_I, C_P, C_B) for each picture type in the set prior to
compression coding each picture by using

1) the degree of difficulty (D_I, D_P, D_B) of compression coding each picture type, said degree of difficulty
corresponding to the spatial complexity of a respective picture such that the bit allocation (C_I, C_P, C_B) for a
respective picture type increases or decreases with an increasing or decreasing degree of difficulty
(D_I, D_P, D_B) of compression coding said respective picture type, and

2) said total numbers (N_I, N_P, N_B) of each of the three picture types in each said set to produce allocations
which meet said fixed target bit-rate.

2. A method as in claim 1, wherein the allocating step is performed in accordance with the following equations

$$C_{set} = n_I C_I + n_P C_P + n_B C_B;$$

$$C_P = W_P \, \frac{D_P - E'_r}{D_I - E'_r} \, C_I; \text{ and}$$

$$C_B = W_B \, \frac{D_B - E'_r}{D_I - E'_r} \, C_I,$$

where W_P and W_B are weighting terms to deemphasize the P- and the B-picture bit allocation with respect to the
other ones, and E'_r is an average motion compensation error produced in a motion compensated picture
reconstruction.
3. A method as in claim 1 wherein the allocation of bits for each picture type is proportional to the degree of difficulty
in compression coding that picture type.

4. A method as in claim 1 wherein the degree of difficulty of compression coding each picture type is determined using
the difficulty of compression coding the pixel data of the picture about to be coded and at least one picture of said
set already coded.

5. A method as in claim 1 wherein the degree of difficulty of compression coding each picture type is determined using
the difficulty of compression coding the pixel difference data of the picture to be coded and at least one picture of
said set already coded.

6. A system for the allocation of bits to be used to compression code digital data signals representing a set or sets
of pictures in a motion video sequence in the form of an M-PEG II data stream, comprising:

means for identifying each picture in said set or sets to be compression coded as one of three types I, P, B;

means (1-3) for determining the total number of bits (B_set) to be used in compression coding each set of
pictures based on a fixed target bit rate for each sequence wherein said total bit number (B_set) equals an average
bit allocation (B_avg) multiplied by the total number of I-pictures (n_I), P-pictures (n_P) and B-pictures (n_B)
occurring in said set so that B_set = (n_I + n_P + n_B) · B_avg; means for preferably subtracting from said total number
of bits (B_set) an estimate (S_set) of the number of bits required to code side information in each set resulting in
a modified total number of bits (C_set);

means (2) for allocating from said total number of bits (C_set or B_set), bits for use in compression coding a picture
in each set by determining the allocations (C_I, C_P, C_B) for each picture type in the set prior to compression
coding each picture by using

1) the degree of difficulty (D_I, D_P, D_B) of compression coding each picture type, said degree of difficulty
corresponding to the spatial complexity of a respective picture such that the bit allocation (C_I, C_P, C_B) for a
respective picture type increases or decreases with an increasing or decreasing degree of difficulty
(D_I, D_P, D_B) of compression coding said respective picture type, and

2) said total numbers (N_I, N_P, N_B) of each of the three picture types in each said set to produce allocations
which meet said fixed target bit-rate.

7. A system as in claim 6, wherein said allocating means (2) is performed in accordance with the following equations

$$C_{set} = n_I C_I + n_P C_P + n_B C_B;$$

$$C_P = W_P \, \frac{D_P - E'_r}{D_I - E'_r} \, C_I; \text{ and}$$

$$C_B = W_B \, \frac{D_B - E'_r}{D_I - E'_r} \, C_I,$$

where W_P and W_B are weighting terms to deemphasize the P- and the B-picture bit allocation with respect to the
other ones, and E'_r is an average motion compensation error produced in a motion compensated picture
reconstruction.
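
The allocation of claims 1 and 2 can be worked through numerically. The sketch below is an illustration under assumptions: it takes the ratio form C_P = W_P (D_P − E'_r)/(D_I − E'_r) C_I (and likewise for C_B) suggested by claim 2, substitutes it into C_set = n_I C_I + n_P C_P + n_B C_B, and solves for C_I. The function name and the numeric values in the usage note are hypothetical.

```python
def allocate_bits(n_i, n_p, n_b, b_avg, s_set, d_i, d_p, d_b, e_r, w_p, w_b):
    """Return per-picture allocations (C_I, C_P, C_B) for one set of pictures."""
    b_set = (n_i + n_p + n_b) * b_avg   # total bits for the set
    c_set = b_set - s_set               # bits left after side information
    # Ratios of the P- and B-picture allocations to the I-picture allocation
    # (assumed form; requires d_i != e_r).
    k_p = w_p * (d_p - e_r) / (d_i - e_r)
    k_b = w_b * (d_b - e_r) / (d_i - e_r)
    # Solve c_set = n_i*C_I + n_p*k_p*C_I + n_b*k_b*C_I for C_I.
    c_i = c_set / (n_i + n_p * k_p + n_b * k_b)
    return c_i, k_p * c_i, k_b * c_i
```

For a set with 1 I-picture, 4 P-pictures and 10 B-pictures, an average of 40000 bits per picture and 60000 bits of side information, `allocate_bits(1, 4, 10, 40000, 60000, 100, 60, 40, 20, 1.0, 0.5)` splits the remaining 540000 bits so that each P-picture receives half, and each B-picture one eighth, of the I-picture allocation.
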
Patentansprüche (Claims)

1. A method for allocating bits that are used for the compression coding of digital data signals representing one or
more groups of pictures in a motion video sequence, in the form of an M-PEG II data stream, consisting of the
following steps:

identifying each picture of the group or groups that is to be subjected to compression coding as belonging to
one of the three types I, P, B;

determining the total number of bits (B_set) to be used in the compression coding of each group of pictures on the
basis of a fixed target bit rate for each sequence, the total number of bits (B_set) being equal to an average bit
allocation (B_avg) multiplied by the total number of I-pictures (n_I), P-pictures (n_P) and B-pictures (n_B) in the
group, so that B_set = (n_I + n_P + n_B) · B_avg;

and, after an estimate (S_set) of the number of bits needed to code the side information in each group has
preferably been subtracted from the total number of bits (B_set), resulting in a modified total number of bits (C_set),
allocating from the total number of bits (C_set or B_set) bits for use in the compression coding of a picture in each
group, by determining allocations (C_I, C_P, C_B) for each picture type in the group before the compression coding
of the individual pictures, using

1) the degree of difficulty (D_I, D_P, D_B) of the compression coding of each picture type, the degree of
difficulty corresponding to the spatial complexity of a respective picture, so that the bit allocation (C_I, C_P, C_B)
for a respective picture type increases or decreases with an increasing or decreasing degree of difficulty
(D_I, D_P, D_B) of the compression coding, and

2) the total numbers (N_I, N_P, N_B) of each of the three picture types in each group to produce allocations
that meet the fixed target bit rate.

2. A method according to claim 1, in which the allocating step is carried out according to the following equations

$$C_{set} = n_I C_I + n_P C_P + n_B C_B;$$

$$C_P = W_P \, \frac{D_P - E'_r}{D_I - E'_r} \, C_I; \text{ and}$$

$$C_B = W_B \, \frac{D_B - E'_r}{D_I - E'_r} \, C_I,$$

where W_P and W_B are weighting factors for placing less emphasis on the bit allocations of P- and B-pictures compared
with the others, and E'_r is an average motion compensation error that occurs in the motion-compensated picture
reconstruction.

3. A method according to claim 1, in which the bit allocation for each picture type is proportional to the degree of
difficulty of the compression coding of the respective picture type.

4. A method according to claim 1, in which the degree of difficulty of the compression coding of each picture type is
determined on the basis of the difficulty of the compression coding of the pixel data of the picture to be coded and of
at least one already coded picture from the group.

5. A method according to claim 1, in which the degree of difficulty of the compression coding of each picture type is
determined on the basis of the difficulty of the compression coding of the pixel difference data of the picture to be
coded and of at least one already coded picture from the group.

6. A system for allocating bits that are used for the compression coding of digital data signals representing one or
more groups of pictures in a motion video sequence, in the form of an M-PEG II data stream, consisting of:

means for identifying each picture of the group or groups that is to be subjected to compression coding as
belonging to one of the three types I, P, B;

means (1-3) for determining the total number of bits (B_set) to be used in the compression coding of each group of
pictures on the basis of a fixed target bit rate for each sequence, the total number of bits (B_set) being equal to an
average bit allocation (B_avg) multiplied by the total number of I-pictures (n_I), P-pictures (n_P) and B-pictures (n_B)
in the group, so that B_set = (n_I + n_P + n_B) · B_avg; means for preferably subtracting from the total number of
bits (B_set) an estimate (S_set) of the number of bits needed to code the side information in each group, resulting
in a modified total number of bits (C_set);

and means (2) for allocating from the total number of bits (C_set or B_set) bits for use in the compression coding of
a picture in each group, by determining allocations (C_I, C_P, C_B) for each picture type in the group before the
compression coding of the individual pictures, using

1) the degree of difficulty (D_I, D_P, D_B) of the compression coding of each picture type, the degree of
difficulty corresponding to the spatial complexity of a respective picture, so that the bit allocation (C_I, C_P, C_B)
for a respective picture type increases or decreases with an increasing or decreasing degree of difficulty
(D_I, D_P, D_B) of the compression coding, and

2) the total numbers (N_I, N_P, N_B) of each of the three picture types in each group to produce allocations
that meet the fixed target bit rate.

7. A system according to claim 6, in which the allocating means (2) satisfies the following equations:

$$C_{set} = n_I C_I + n_P C_P + n_B C_B;$$

$$C_P = W_P \, \frac{D_P - E'_r}{D_I - E'_r} \, C_I; \text{ and}$$

$$C_B = W_B \, \frac{D_B - E'_r}{D_I - E'_r} \, C_I,$$

where W_P and W_B are weighting factors for placing less emphasis on the bit allocations of P- and B-pictures compared
with the others, and E'_r is an average motion compensation error that occurs in the motion-compensated picture
reconstruction.

Revendications (Claims)

1. A method for the allocation of bits that are to be used for the compression of coded digital data signals
representing a set or sets of pictures in a motion video sequence, in the form of an M-PEG II data stream, comprising
the steps of:

identifying each picture in said set or sets that is to be compression coded as one of the three types I, P, B;

determining the total number of bits (B_set) that are to be used in the compression coding of each set of pictures
on the basis of a fixed target bit rate for each sequence, wherein said total number of bits (B_set) is equal to an
average bit allocation (B_avg) multiplied by the total number of I-pictures (n_I), P-pictures (n_P) and B-pictures (n_B)
occurring in said set, so that

$$B_{set} = (n_I + n_P + n_B) \cdot B_{avg};$$

and, after preferably subtracting from said total number of bits (B_set) an estimate (S_set) of the number of bits
required to code the side information in each set, resulting in a modified total number of bits (C_set), allocating
from said total number of bits (C_set or B_set) bits for use in the compression coding of a picture in each set by
determining the allocations (C_I, C_P, C_B) for each picture type in the set before the compression coding of each
picture, using

1) the degree of difficulty (D_I, D_P, D_B) of the compression coding of each picture type, said degree of
difficulty corresponding to the spatial complexity of a respective picture in such a way that the bit allocation
(C_I, C_P, C_B) for a respective picture type increases or decreases with an increasing or decreasing degree of
difficulty (D_I, D_P, D_B) of the compression coding of said respective picture type, and

2) said total numbers (N_I, N_P, N_B) of each of the three picture types in each said set to produce
allocations that meet said fixed target bit rate.

2. A method according to claim 1, in which the allocating step is carried out in accordance with the following
equations

$$C_{set} = n_I C_I + n_P C_P + n_B C_B;$$

$$C_P = W_P \, \frac{D_P - E'_r}{D_I - E'_r} \, C_I; \text{ and}$$

$$C_B = W_B \, \frac{D_B - E'_r}{D_I - E'_r} \, C_I,$$

in which W_P and W_B are weighting terms to deemphasize the P-picture and B-picture bit allocation with respect to the
others, and E'_r is an average motion compensation error produced in a motion-compensated picture reconstruction.

3. A method according to claim 1, in which the allocation of bits for each picture type is proportional to the degree of
difficulty in the compression coding of that picture type.

4. A method according to claim 1, in which the degree of difficulty of compression coding each picture type is
determined using the difficulty of compression coding the pixel data of the picture that is to be coded and of at least
one picture of said set already coded.

5. A method according to claim 1, in which the degree of difficulty of compression coding each picture type is
determined using the difficulty of compression coding the pixel difference data of the picture that is to be coded and
of at least one picture of said set already coded.

6. A system for the allocation of bits that are to be used for the compression coding of digital data signals
representing a set or sets of pictures in a motion video sequence, in the form of an M-PEG II data stream, comprising:

means for identifying each picture in said set or sets that is to be compression coded as one of the three types
I, P, B;

means (1 to 3) for determining the total number of bits (B_set) that are to be used in the compression coding of
each set of pictures on the basis of a fixed target bit rate for each sequence, wherein said total number of bits
(B_set) is equal to an average bit allocation (B_avg) multiplied by the total number of I-pictures (n_I), P-pictures (n_P)
and B-pictures (n_B) occurring in said set, in such a way that B_set = (n_I + n_P + n_B) · B_avg; means for preferably
subtracting from said total number of bits (B_set) an estimate (S_set) of the number of bits required to code the side
information in each set, resulting in a modified total number of bits (C_set);

means (2) for allocating from said total number of bits (C_set or B_set) bits for use in the compression coding of a
picture in each set by determining the allocations (C_I, C_P, C_B) for each picture type in the set before the
compression coding of each picture, using

1) the degree of difficulty (D_I, D_P, D_B) of the compression coding of each picture type, said degree of
difficulty corresponding to the spatial complexity of a respective picture in such a way that the bit allocation
(C_I, C_P, C_B) for a respective picture type increases or decreases with an increasing or decreasing degree of
difficulty (D_I, D_P, D_B) of the compression coding of said respective picture type, and

2) said total numbers (N_I, N_P, N_B) of each of the three picture types in each said set to produce
allocations that meet said fixed target bit rate.

7. A system according to claim 6, in which said allocating means (2) is implemented in accordance with the following
equations

$$C_{set} = n_I C_I + n_P C_P + n_B C_B;$$

$$C_P = W_P \, \frac{D_P - E'_r}{D_I - E'_r} \, C_I; \text{ and}$$

$$C_B = W_B \, \frac{D_B - E'_r}{D_I - E'_r} \, C_I,$$

in which W_P and W_B are weighting terms to deemphasize the P-picture and B-picture bit allocation with respect to the
others, and E'_r is an average motion compensation error produced in a motion-compensated picture reconstruction.
FIG. 2